Abstract

We introduce a lattice model of protein conformations which is able
to reproduce secondary structures of proteins (alpha-helices and
beta-sheets). This model is based on the following two main ideas. First,
we model backbone parts of amino acid residues in a peptide chain by
edges in the cubic lattice which are not parallel to the coordinate axes.
Second, we describe possible contacts of amino acid residues using a
discrete model of the Ramachandran plot.

This model makes it possible to describe hydrogen bonds between the
residues in the backbone of the peptide chain. In particular, the lattice
secondary structures have the correct structure of hydrogen bonds. We
also take into account the side chains of amino acid residues and their
interaction.

We propose an expression for the energy of a conformation of a lattice
protein which contains contributions from hydrogen bonds in the backbone
of the peptide chain and from the interaction of the side chains.
The lattice secondary structures are local minima of the introduced
energy.
1 Introduction

In the present paper we construct a model of a lattice protein which describes
the formation of lattice secondary structures (alpha-helices and beta-sheets).
We use lattice models of hydrogen bonds in the backbone of the peptide chain
and a lattice model of the Ramachandran plot which describes the geometric
restrictions on the conformations of peptide chains. Before we describe our
model, let us mention some previous work.

For previous discussions of lattice models of polymers see, e.g., [1], [2],
[3], [4], [5]. For a review of the physics of proteins see [5] and for a discussion
of the problem of protein folding see [7]. In [8] a review of applications of
combinatorial algorithms to protein folding was given.

In [9], [10] lattice and off-lattice models of polymeric chains taking into
account the directed hydrogen bond interaction were considered. The
secondary structures were discussed in connection with the long-range order
in such models.

In the present paper we introduce a new lattice model of protein
conformations. Our aim is to construct a lattice model which approximates
the geometry of conformations of real proteins, in particular the geometry
of secondary structures of proteins. We model a protein conformation by
a continuous broken line in the lattice $\mathbb{Z}^3$. The $C_\alpha$-atoms of the
peptide chain lie at some vertices of the lattice $\mathbb{Z}^3$, and the backbone parts
of amino acid residues (which connect the consecutive $C_\alpha$-atoms in the
peptide chain) correspond to the edges of the conformation of a lattice protein.

In the standard approach to lattice polymers the monomers in a polymer
chain are described by edges of a lattice conformation which connect the
nearest vertices in the lattice $\mathbb{Z}^3$. The angles between such edges can be
equal to either $\pi/2$ or $\pi$. Therefore the standard approach to lattice polymers
imposes serious restrictions on the possible geometry of a polymer chain. In
particular, this geometry differs considerably from the geometry of peptide
chains, where the angles between the consecutive backbone residues usually
vary from approximately 100° (for alpha-helices) to 120° (for beta-sheets).

We propose a lattice model of protein conformations where backbone
parts of amino acid residues are modeled by edges in the lattice $\mathbb{Z}^3$ which
are not parallel to the coordinate axes (in particular, these edges connect
vertices of $\mathbb{Z}^3$ which are not neighbors). The model is given by the set of
possible edges, the set of possible angles between the edges, and the discrete
analogue of the Ramachandran diagram (which describes the possible pairs
of the backbone dihedral angles for $C_\alpha$-atoms in a peptide chain). In order to
build the discrete Ramachandran plot we consider orientations of the planes
of amino acid residues (as lattice vectors which are approximately orthogonal
to the corresponding edges of the lattice model). The orientation of an edge
describes the direction of the possible hydrogen bond.

Thus in the model under consideration a conformation of the backbone of 
the peptide chain is described by the sequence of edges of the lattice model, 
where each edge is characterized by the beginning, end and the orientation 
vectors. The rules of selection for the possible angles between the consecutive 
edges and the pairs of their orientations will realize a discrete model of the 
Ramachandran plot. 

We show that the proposed model of lattice polymers allows one to build
lattice models of secondary structures of proteins such as alpha-helices and
beta-sheets. Moreover, these lattice secondary structures have the correct
structure of hydrogen bonds (described by the orientation vectors of the
edges).

We also consider the side chains of amino acid residues (and the
dependence of the Ramachandran plot on the side chain). We introduce an
expression for the energy of a lattice protein which contains the
contributions from the hydrogen bonds of the backbone and from the interaction
of the side chains. Lattice secondary structures in the model under
consideration can be considered as the minima of the energy of lattice polymers
where the majority of hydrogen bonds in the backbone of the polymer chain are
saturated and the geometric restrictions for conformations of the lattice
peptide chains are satisfied.

The structure of the present paper is as follows. 

In section 2 we describe the space of possible conformations for the model 
of a lattice polymer under consideration. 

In section 3 we describe lattice secondary structures in the considered 
model, namely alpha-helices and beta-sheets. 

In section 4 we construct the energy for our model of lattice polymer. 
The expression for the energy contains contributions from hydrogen bonds 
in the backbone of the chain and from interaction of side chains. 

In section 5 we present some conclusions based on the results of the present
paper.

2 Conformations of lattice proteins 

In the present section we describe the set of conformations for the model of 
lattice proteins under consideration. 

The backbone parts of amino acid residues will be modeled by some edges
in the lattice $\mathbb{Z}^3$ which are not necessarily parallel to the coordinate axes.
Here any edge of the lattice connects two vertices of the lattice. The cubic
lattice $\mathbb{Z}^3$ is the group generated by the unit vectors $e_1$, $e_2$, $e_3$ (parallel
to the coordinate axes), and the edges correspond to linear combinations of
these unit vectors with integer coefficients.

The conformation of the backbone of a peptide chain for the model of
lattice protein under consideration corresponds to a map
$T : \{1, \dots, N\} \to \mathbb{Z}^3$ satisfying the conditions:

1) Any pair $i$, $i+1$ belonging to $\{1, \dots, N\}$ maps to vertices $T(i)$, $T(i+1)$
from $\mathbb{Z}^3$ connected by an edge from the fixed set of edges, see below;

2) The contact between the consecutive edges $E_{i-1} = (T(i-1), T(i))$,
$E_i = (T(i), T(i+1))$ of the chain should be allowed, see definition 2 below;

3) The distance between any two edges of the lattice polymer is greater than
or equal to two.

The vertices of the lattice polymer chain are ordered along the chain.
This order fixes the signs of the coefficients of the vector corresponding to
an edge: the vector corresponds to the translation from the smaller to the
larger vertex.

Let us consider the following set of edges in the lattice $\mathbb{Z}^3$: the edges
connect vertices related by translations of the form

$$A_1 e_1 + A_2 e_2 + A_3 e_3, \qquad A_i \in \{0, \pm 1, \pm 2\}, \tag{1}$$

where for each edge one of the coefficients $A_i$ is equal to zero, the second
coefficient is equal to $\pm 1$, and the third one is equal to $\pm 2$ (the coefficients
equal to $0$, $\pm 1$, $\pm 2$ can be assigned to the basis vectors in any order).

Equivalently, the vectors (1) can be considered as arbitrary lattice rotations
(combinations of rotations by $\pi/2$ with respect to the coordinate axes)
of the vector

$$2e_1 + e_2. \tag{2}$$

We will also consider some additional edges (which we call nonstandard)
which are not of the form (1), namely those corresponding to the lattice
rotations of the vectors

$$2e_1, \qquad 2e_1 + e_2 + e_3. \tag{3}$$

In the following we will not distinguish the edges and the corresponding 
vectors. 
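As an illustration, the edge sets above are small enough to enumerate directly. The following Python sketch is not part of the original construction; the helper name `signed_permutations` is hypothetical. It assumes that a lattice rotation acts on a vector by permuting its coordinates and changing their signs. Signed permutations also include reflections, but for the three base vectors used here they generate the same sets of vectors as proper lattice rotations.

```python
from itertools import permutations, product


def signed_permutations(base):
    """Return all vectors obtained from `base` by permuting coordinates and flipping signs."""
    vectors = set()
    for perm in permutations(base):
        for signs in product((1, -1), repeat=3):
            vectors.add(tuple(s * c for s, c in zip(signs, perm)))
    return vectors


# Standard edges: lattice rotations of 2e1 + e2, formula (2).
standard_edges = signed_permutations((2, 1, 0))

# Nonstandard edges: lattice rotations of 2e1 and 2e1 + e2 + e3, formula (3).
nonstandard_edges = signed_permutations((2, 0, 0)) | signed_permutations((2, 1, 1))

print(len(standard_edges), len(nonstandard_edges))  # prints: 24 30
```

Under these assumptions there are 24 standard edges, all of squared length 5, and 30 nonstandard edges.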
An edge in the lattice gives only a partial description of the backbone
part of the amino acid residue. One has to describe the orientation of an 
edge, i.e. the direction in which an edge can have hydrogen bonds. 

Definition 1 Let an edge $E$ correspond to some lattice rotation of a vector
(1) or (3). The orientation of the edge $E$ is a vector $e$ equal to some vector
from the basis $\{e_1, e_2, e_3\}$ of the lattice taken with the sign $\pm 1$, where $e$ is
orthogonal to the basis vector with the largest (of modulus two) coefficient
in the expansion of $E$ over the basis $\{e_1, e_2, e_3\}$.

Therefore an edge with an orientation is described by the pair of vectors
$(E, e)$, and any edge $E$ possesses four possible orientations.
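The count of four orientations can be checked with a short Python sketch of Definition 1. The function name `orientations` and the representation of vectors as integer triples in the basis $(e_1, e_2, e_3)$ are assumptions made for this illustration.

```python
def orientations(edge):
    """Four possible orientation vectors of an edge, following Definition 1."""
    # Index of the basis vector whose coefficient has modulus two.
    main_axis = max(range(3), key=lambda k: abs(edge[k]))
    result = []
    for axis in range(3):
        if axis == main_axis:
            continue  # the orientation must be orthogonal to the main axis
        for sign in (1, -1):
            unit = [0, 0, 0]
            unit[axis] = sign
            result.append(tuple(unit))
    return result


print(orientations((2, 1, 0)))
# prints: [(0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
```

For the edge $2e_1 + e_2$ the possible orientations are $\pm e_2$ and $\pm e_3$; the same rule applies to the nonstandard edges, where exactly one coefficient has modulus two.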

The cosine of the angle between two edges of the form described above
can take values from some finite set. We will say that
the angle between two consecutive edges of the form (1) is allowed if the
cosine of this angle takes values from the set $\{-0.6, -0.4, -0.2\}$ (where the
angle between the consecutive edges $E$, $E'$ is the angle between the vectors $E$,
$-E'$).
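The finite set of cosines can be made explicit. In the sketch below (an illustration only; the names `chain_cosine` and `is_allowed_angle` are hypothetical) the cosine is computed exactly as a fraction, using the fact that every standard edge has squared length 5, so that the cosine of the angle between $E$ and $-E'$ equals $-(E \cdot E')/5$.

```python
from fractions import Fraction
from itertools import permutations, product

ALLOWED_COSINES = {Fraction(-3, 5), Fraction(-2, 5), Fraction(-1, 5)}


def chain_cosine(edge, next_edge):
    """Cosine of the angle between E and -E' for two standard edges."""
    dot = sum(a * b for a, b in zip(edge, next_edge))
    return Fraction(-dot, 5)  # both edges have squared length 5


def is_allowed_angle(edge, next_edge):
    """True if the cosine belongs to the set {-0.6, -0.4, -0.2}."""
    return chain_cosine(edge, next_edge) in ALLOWED_COSINES


# All standard edges: coordinate permutations of (2, 1, 0) with arbitrary signs.
standard_edges = {
    tuple(s * c for s, c in zip(signs, perm))
    for perm in permutations((2, 1, 0))
    for signs in product((1, -1), repeat=3)
}
all_cosines = sorted({chain_cosine(a, b) for a in standard_edges for b in standard_edges})
print(len(all_cosines))  # prints: 11
```

With this convention the possible cosines are the eleven values $k/5$, $k = -5, \dots, 5$, of which three are allowed.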

The allowed angles between the nonstandard edges (3), and between the
standard and nonstandard edges, are described below in formulas (5)-(10).

Remark The angle between the covalent bonds at the $C_\alpha$-atom of a peptide
chain is approximately equal to 109°. In our model the angle between the
consecutive edges is variable. This is related to the observation that an edge
connects two consecutive $C_\alpha$-atoms and is not parallel to the corresponding
covalent bonds. The angle between the edges varies with the rotation
of the edges and depends on the corresponding dihedral angles. The angle
between the edges belongs to some interval (approximately between 100° and
120°, see for example [6]); in our discrete model the cosine of this angle
takes values from the set $\{-0.6, -0.4, -0.2\}$ (for standard edges (1)).

We describe the side chain of the $i$-th amino acid residue (connected to
the $i$-th $C_\alpha$-atom of the chain) with the help of a vector $s_i$ with the initial
point in the $i$-th vertex of the lattice polymer. We denote by $(S_i, R_i)$ the pair
(the position of the side chain in the lattice, the kind of the side chain). Thus
the vector $s_i$ connects the vertex $T(i)$ of the lattice peptide chain with the
point $S_i \in \mathbb{Z}^3$ where the side chain of the kind $R_i$ is situated.
Some examples of possible conformations $T$ of lattice proteins satisfying
the conditions described above (without side chains) are shown in Figures 1,
2, 3. Figures 1 and 2 show examples of a lattice alpha-helix and a lattice
beta-sheet. Fig. 3 shows the left-handed helix, which cannot be found in real
proteins but satisfies the above conditions for edges and angles between the
edges.

In order to eliminate the left-handed helix and similar conformations one has
to take into account the Ramachandran diagram (selection rules for possible
pairs of dihedral angles at $C_\alpha$-atoms), see for example [5]. In our lattice
model the discrete analogue of the Ramachandran diagram is defined by the
selection rules for possible pairs of orientations for the consecutive edges of
the chain.

The next definition describes the discrete Ramachandran plot and the
positions of the corresponding side chains.

Definition 2 The list of allowed contacts for consecutive edges in the lattice
polymer chain has the following form.

The consecutive edges $(E, e)$, $(E', e')$, where $E$, $E'$ have the form (1),

$$E = A_1 e_1 + A_2 e_2 + A_3 e_3, \qquad E' = A'_1 e_1 + A'_2 e_2 + A'_3 e_3, \tag{4}$$

with the orientations $e$, $e'$, have an allowed contact if their form (1) and
orientations (up to lattice rotations) are described by one of the cases 1-6
below; the vector $s$ gives the position of the corresponding side chain:

• 1) The case of alpha-helix, see Fig. 1. The cosine of the angle between
the edges is equal to $-0.2$, the orientations of the consecutive edges
coincide:

$$E = 2e_2 - e_3, \quad E' = 2e_1 - e_3, \quad e = e' = -e_3, \quad s = -2e_1.$$

• 2) The case of beta-sheet, see Fig. 2. The cosine of the angle between
the edges is equal to $-0.6$, the orientations of the consecutive edges are
opposite:

$$E = 2e_1 + e_2, \quad E' = 2e_1 - e_2, \quad e = e_3, \quad e' = -e_3, \quad s = 2e_2.$$

The cases 3-6 correspond to loops: the cosine of the angle between the
edges is equal to $-0.4$, the orientations $e$, $e'$ of the consecutive edges are
orthogonal:

• 3) The cosine of the angle between the edges is equal to $-0.4$,

$$E = 2e_1 + e_2, \quad E' = e_1 - 2e_3, \quad e = e_2, \quad e' = -e_1, \quad s = e_1 + e_3.$$

• 4) The cosine of the angle between the edges is equal to $-0.4$,

$$E = 2e_1 + e_2, \quad E' = e_1 + 2e_3, \quad e = e_3, \quad e' = -e_2, \quad s = 2e_2.$$

The next two cases are allowed only for glycine.

• 5) The cosine of the angle between the edges is equal to $-0.4$,

$$E = 2e_1 + e_2, \quad E' = 2e_2 - e_3, \quad e = e_2, \quad e' = e_1, \quad s = e_1 - e_2.$$

• 6) The cosine of the angle between the edges is equal to $-0.4$,

$$E = 2e_1 + e_2, \quad E' = 2e_2 + e_3, \quad e = e_2, \quad e' = -e_3, \quad s = e_1 + e_3.$$

The cases 7-10 of contacts of consecutive edges below describe contacts
of standard and nonstandard edges for beta turns (or beta bends).

The sequence of edges with orientations (up to lattice rotations) for a
beta turn of type 1 is described by formulas (5)-(9); for a beta turn of
type 2 formula (7) should be replaced by (10) (where the edge $E_2$
has the opposite orientation). The sequence of edges for a beta turn has the
form $E_0 E_1 E_2 E_3 E_4$, where $E_0$ and $E_4$ belong to beta-strands and
$E_1 E_2 E_3$ belong to the beta turn, see Fig. 4.

$$E_0 = 2e_1 - e_2, \qquad e^{(0)} = -e_3; \tag{5}$$

$$E_1 = 2e_1 + e_2 + e_3, \qquad e^{(1)} = e_3; \tag{6}$$

$$E_2 = 2e_3, \qquad e^{(2)} = e_2; \tag{7}$$

$$E_3 = -2e_1 - e_2 + e_3, \qquad e^{(3)} = e_3; \tag{8}$$

$$E_4 = -2e_1 + e_2, \qquad e^{(4)} = -e_3. \tag{9}$$

$$E_2 = 2e_3, \qquad e^{(2)} = -e_2. \tag{10}$$



The edges $E_1$, $E_2$, $E_3$ which belong to the beta turn have a nonstandard form
which cannot be described by formula (1).
The contacts for a beta turn have the form:
• 7) Contact of the edges $E_0$, $E_1$ given by (5), (6): $s_1 = -2e_2$.

• 8) Contact of the edges $E_1$, $E_2$ given by (6), (7): $s_2 = e_1 - e_2 - e_3$.

• 9) Contact of the edges $E_2$, $E_3$ given by (7), (8): $s_3 = e_1 - e_2 + e_3$.

• 10) Contact of the edges $E_3$, $E_4$ given by (8), (9): $s_4 = -2e_2$.
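The geometry of the type 1 beta turn can be checked by summing its edges. The following Python sketch is an illustration only: it encodes the edges (5)-(9) as integer triples in the basis $(e_1, e_2, e_3)$, places the first vertex at the origin (an arbitrary choice), and compares the two strand edges $E_0$ and $E_4$ as unordered pairs of end points.

```python
def add(a, b):
    """Componentwise sum of two lattice vectors."""
    return tuple(x + y for x, y in zip(a, b))


# Edges and orientations of a type 1 beta turn, formulas (5)-(9).
turn_edges = [(2, -1, 0), (2, 1, 1), (0, 0, 2), (-2, -1, 1), (-2, 1, 0)]
turn_orientations = [(0, 0, -1), (0, 0, 1), (0, 1, 0), (0, 0, 1), (0, 0, -1)]

# Chain vertices, starting from the origin.
vertices = [(0, 0, 0)]
for edge in turn_edges:
    vertices.append(add(vertices[-1], edge))

# E_0 and E_4 as unordered pairs of end points.
first = frozenset(vertices[0:2])
last = frozenset(vertices[4:6])
shifted_first = frozenset(add(v, (0, 0, 4)) for v in first)

print(vertices[-1])           # prints: (0, 0, 4)
print(shifted_first == last)  # prints: True
```

Under this reading of the formulas, the turn reverses the direction of the chain and shifts it by $4e_3$, so that $E_4$ coincides, as a set, with $E_0$ translated by four lattice steps along $e_3$, and the two strand edges carry the same orientation $-e_3$.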

If the angles between the consecutive edges of the form (1) have the
cosines $-0.2$ or $-0.6$, then the orientations of the edges are defined by the
conformation of the polymer chain (as for real alpha-helices and beta-sheets).
If the cosine of the angle between the consecutive edges of the form (1)
equals $-0.4$, then there are several possibilities for orientations of the edges.
For glycine we have two additional possibilities for allowed contacts.
Nonstandard edges were used for the description of beta turns.
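The six contacts between standard edges can be tabulated and checked against the stated cosines and orientation relations. The sketch below is an illustration; the table `CONTACTS` is a transcription of cases 1-6 into integer triples in the basis $(e_1, e_2, e_3)$, and the function name `summarize` is hypothetical.

```python
from fractions import Fraction

# (case, E, E', e, e') for the contacts 1-6 of Definition 2.
CONTACTS = [
    (1, (0, 2, -1), (2, 0, -1), (0, 0, -1), (0, 0, -1)),
    (2, (2, 1, 0), (2, -1, 0), (0, 0, 1), (0, 0, -1)),
    (3, (2, 1, 0), (1, 0, -2), (0, 1, 0), (-1, 0, 0)),
    (4, (2, 1, 0), (1, 0, 2), (0, 0, 1), (0, -1, 0)),
    (5, (2, 1, 0), (0, 2, -1), (0, 1, 0), (1, 0, 0)),
    (6, (2, 1, 0), (0, 2, 1), (0, 1, 0), (0, 0, -1)),
]


def dot(a, b):
    """Scalar product of two lattice vectors."""
    return sum(x * y for x, y in zip(a, b))


def summarize(contact):
    """Return (case, cosine of the angle between E and -E', relation of e and e')."""
    case, edge, next_edge, orient, next_orient = contact
    cosine = Fraction(-dot(edge, next_edge), 5)  # standard edges have squared length 5
    relation = {1: "equal", -1: "opposite", 0: "orthogonal"}[dot(orient, next_orient)]
    return case, cosine, relation


summary = [summarize(c) for c in CONTACTS]
for row in summary:
    print(row)
```

The table reproduces the pattern described in the text: equal orientations for cosine $-0.2$, opposite orientations for cosine $-0.6$, and orthogonal orientations for cosine $-0.4$.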

Definition 3 The set $(T, \{e^{(i)}\})$ is a conformation of a lattice protein if:

i) $T$ is an embedding

$$T : \{1, \dots, N\} \to \mathbb{Z}^3,$$

where $i$, $i+1$ belonging to $\{1, \dots, N\}$ map to the vertices $T(i)$, $T(i+1)$ in
$\mathbb{Z}^3$ connected by an edge $E_i$;

ii) the edges $E_i$, $i = 1, \dots, N-1$, have the form of lattice rotations of (1),
(3) and have some orientations $e^{(i)}$;

iii) any two consecutive edges of $T$ have an allowed contact in the sense
of definition 2;

iv) the distance between any two edges of the lattice polymer and between
the side chains is allowed (i.e. greater than or equal to two).

The introduced lattice model is based on the possibility of describing the
geometry of the polymer chain (i.e. the angles between the edges corresponding
to monomers) using edges which connect vertices which are not neighbors
in the lattice $\mathbb{Z}^3$, and on the discrete model of the Ramachandran plot. We
will show that the defined model approximates secondary structures of
proteins well. We will propose an expression for the energy of a conformation
of a lattice protein based on the description of hydrogen bonds between the
backbone parts of the monomers and will show that the minima of this energy
coincide with the lattice secondary structures.
3 Secondary structures



Let us describe lattice models of protein secondary structures and show that
these structures are compatible with our model in the sense of definition 3.
Figures 1 and 2 show the lattice alpha-helix and beta-sheet (in projection).
The alpha-helix can be directed along any of the coordinate axes, the
beta-sheet can be parallel to any of the coordinate planes, and beta-strands can
be directed along any of the coordinate axes in this plane. Let us note that the
(infinite) alpha-helix is invariant with respect to the following transformation:
the clockwise rotation by 90° combined with the translation forward by one
step of the lattice.

An edge of the lattice model describes a backbone part of a monomer
(amino acid residue). One turn of a lattice alpha-helix contains four edges
(3.6 monomers in real proteins, [6]). The cosine of the angle between the
consecutive edges in a lattice alpha-helix is equal to $-0.2$, and in a lattice
beta-sheet it is equal to $-0.6$, which are close to the values (of the cosines of
the angles) for real proteins. It is natural to take the step of the lattice
$\mathbb{Z}^3$ (the distance between the nearest points in $\mathbb{Z}^3$) to be equal to
1.5 angstrom (this gives realistic sizes for the models of secondary structures,
in particular for the distance between the turns in an alpha-helix).
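The sizes implied by this choice of the lattice step follow from elementary arithmetic. The sketch below is an illustration under the assumptions stated in the text: a lattice step of 1.5 angstrom, standard edges of the form $2e_1 + e_2$, and a lattice alpha-helix which advances by one lattice step per edge with four edges per turn.

```python
import math

LATTICE_STEP = 1.5  # angstrom, the lattice step suggested in the text

edge_length = math.sqrt(5) * LATTICE_STEP  # length of a standard edge 2e1 + e2
helix_pitch = 4 * LATTICE_STEP             # four edges per turn, one lattice step each

# Angles between consecutive edges for the three allowed cosines.
helix_angle = math.degrees(math.acos(-0.2))
loop_angle = math.degrees(math.acos(-0.4))
sheet_angle = math.degrees(math.acos(-0.6))

print(f"edge length {edge_length:.2f} A, helix pitch {helix_pitch:.1f} A")
# prints: edge length 3.35 A, helix pitch 6.0 A
print(f"angles {helix_angle:.1f}, {loop_angle:.1f}, {sheet_angle:.1f} degrees")
# prints: angles 101.5, 113.6, 126.9 degrees
```

For comparison, commonly quoted values for real proteins are about 3.8 angstrom for the distance between consecutive $C_\alpha$-atoms and about 5.4 angstrom for the pitch of an alpha-helix.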

The lattice models shown in Figures 1 and 2 are similar to the real
secondary structures one finds in proteins. The orientations of the edges in
lattice secondary structures have the following form. In an alpha-helix the
orientation vectors are parallel to the helix and are parallel to each other
(this corresponds to the presence of hydrogen bonds between the edges obtained
by a translation by a unit vector along the helix). In a beta-sheet the
orientation vectors are directed from the edges to the parallel edges in the
parallel beta-strands, and the orientation vectors are antiparallel for the
consecutive edges in the chain (which again reflects the picture of hydrogen
bonds between the strands in a beta-sheet).

The orientations of edges are parallel to the coordinate axes, thus in the
model under consideration the secondary structures are aligned with the
coordinate axes and planes, a feature which is natural for lattice models.

Figures 5 and 6 show the lattice alpha-helix and beta-sheet with side 
chains. 
4 A model of energy of a lattice protein



Secondary structures in proteins are stabilized by hydrogen bonds in the
backbone of the peptide chain. Any amino acid residue can form two
hydrogen bonds, which should be approximately orthogonal to the backbone
of the peptide chain.

Let us consider the translations of an edge E by ±4e (where e is the unit 
vector of orientation of E). We will say that an edge E with orientation e 
can have hydrogen bonds with the described translation if the orientations 
of the edge E and its translation coincide. 

We introduce an expression for the energy of a conformation of a lattice
protein which contains the contributions from hydrogen bonds in the
backbone of the chain and from the interaction between the side chains of
amino acid residues.

A conformation of a lattice protein of length $N$ is defined by the map

$$T : \{1, \dots, N\} \to \mathbb{Z}^3$$

satisfying the set of conditions of definition 3. The edge $E_i = (T(i), T(i+1))$
corresponds to the translation from $T(i)$ to $T(i+1)$; the orientation of this
edge will be denoted by $e^{(i)}$.

We introduce the energy of the conformation $T$ of a lattice protein as
follows:

$$E(T) = E_{HB}(T) + E_{SC}(T) = -\sum_{i,j=1}^{N-1} \delta(E_i + 4e^{(i)}, E_j)\,\delta(e^{(i)}, e^{(j)}) + \sum_{i=1}^{N-1} p(E_i) - \sum_{i,j=2}^{N-1} \theta(S_i, S_j)\, M_{R_i R_j}. \tag{11}$$

The first sum in the expression above describes the contribution to the
energy from hydrogen bonds in the backbone of the peptide chain. Here
$\delta(E_i + 4e^{(i)}, E_j)$ is equal to one when the edges $E_j$ and $E_i + 4e^{(i)}$ (the
translation of $E_i$ along the orientation vector $e^{(i)}$) coincide and is equal to
zero otherwise. Note that the edges $E_j$ and $E_i + 4e^{(i)}$ should coincide as sets
(i.e. the corresponding vectors might be antiparallel). Analogously,
$\delta(e^{(i)}, e^{(j)})$ is equal to one when the orientation vectors of the $i$-th and
the $j$-th edges coincide (i.e. are parallel) and is equal to zero otherwise.
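The hydrogen bond term can be evaluated for a lattice alpha-helix with a few lines of Python. The sketch below is an illustration, not the author's implementation: the function names `helix` and `hydrogen_bond_energy` are hypothetical, edges are stored as unordered pairs of end points (so that antiparallel edges coincide as sets), and the helix is built from the alpha-helix contact $E = 2e_2 - e_3$, $E' = 2e_1 - e_3$ repeated by lattice rotations about $e_3$, with all orientations equal to $-e_3$.

```python
def add(a, b):
    """Componentwise sum of two lattice vectors."""
    return tuple(x + y for x, y in zip(a, b))


def helix(n_edges):
    """Vertices and orientations of a lattice alpha-helix directed along -e3."""
    steps = [(0, 2, -1), (2, 0, -1), (0, -2, -1), (-2, 0, -1)]
    vertices = [(0, 0, 0)]
    for i in range(n_edges):
        vertices.append(add(vertices[-1], steps[i % 4]))
    orientations = [(0, 0, -1)] * n_edges
    return vertices, orientations


def hydrogen_bond_energy(vertices, orientations):
    """First sum of the energy: minus the number of saturated hydrogen bonds."""
    segments = [frozenset(vertices[i:i + 2]) for i in range(len(vertices) - 1)]
    index = {segment: j for j, segment in enumerate(segments)}
    energy = 0
    for i, e in enumerate(orientations):
        shift = tuple(4 * c for c in e)
        moved = frozenset(add(v, shift) for v in segments[i])  # E_i + 4 e^(i)
        j = index.get(moved)
        if j is not None and orientations[j] == e:
            energy -= 1
    return energy


print(hydrogen_bond_energy(*helix(12)))  # prints: -8
```

In this sketch a helix with $n \ge 4$ edges has $n - 4$ hydrogen bonds: the edge $E_i$ is bonded to $E_{i+4}$, so interior edges take part in two bonds and the four edges at each end in only one.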

The second sum in (11) describes energetic penalties for the use of
nonstandard edges. The energetic penalty $p(E_i)$ is equal to zero if the edge
$E_i$ is standard and is equal to some positive value otherwise. Nonstandard edges
are involved in beta-turns where energetic penalties are related to the torsion
of the peptide chain. Energetic penalties (say for beta-turns in beta-sheets)
can be compensated by the energies of hydrogen bonds and interaction of
side chains.

For the edges $E_1$ and $E_3$ of the beta-turn (see the formulas for the beta
turn in section 2) there exists a hydrogen bond. We ignore this bond in our
model (this can be considered as a variant of energetic penalty).

The third sum in (11) describes the contribution to the energy of a lattice
protein from the interaction of side chains of amino acid residues. Here $S_i$
and $R_i$ are the position and the type of the $i$-th side chain, the function
$\theta(S_i, S_j)$ depends on the distance between the side chains $S_i$ and $S_j$ (for
example, it is equal to one for distances not larger than 4 in the lattice and
decreases to zero for distances larger than 5), and the matrix $M_{R_i R_j}$ (the
Miyazawa-Jernigan matrix) describes the interaction of side chains (the indices
of this matrix enumerate the set of 20 amino acids).

Remark Let us discuss the lattice alpha-helix, see Fig. 5, and beta-sheet,
see Fig. 6. It is easy to see that for these lattice conformations all
contributions to the energy (11) related to hydrogen bonds in the backbone of
the lattice peptide chain (excluding the beta-turns) are present in the first
sum in (11), i.e. any edge excluding the beta-turns possesses two hydrogen
bonds. This corresponds to the known property that all hydrogen bonds
for the backbone of the peptide chain in secondary structures are saturated.
Here we have to take into account the boundary effects: for edges at the
boundaries of the alpha-helix and beta-sheet only one hydrogen bond per
edge is saturated.

Moreover, the pairs of the nearest side chains of amino acid residues in
lattice secondary structures are at distance four from each other,
therefore these side chains interact (i.e. give contributions to the energy
(11)).

Therefore the energy (11) of lattice secondary structures (alpha-helices
and beta-sheets) is low. For short lattice polymer chains the considered
lattice secondary structures will be the minima of energy; for long lattice
polymer chains combinations of interacting secondary structures (say of two
parallel helices) will have lower energy due to the interaction between
the side chains of the different secondary structures.

It would be interesting to investigate the following question: do there
exist alternative lattice secondary structures, i.e. conformations of lattice
polymers which consist mainly of standard edges (1), such that for any
standard edge in the conformation both hydrogen bonds are saturated (again,
taking into account the boundary effects)?

5 Conclusion 

In the present paper we have constructed a lattice model for the conformation
of a protein. For this model the main secondary structures of proteins
(alpha-helices and beta-sheets) are local minima of the energy of a lattice
polymer. We approximate the geometry of peptide chains using edges in the
lattice $\mathbb{Z}^3$ which are not parallel to the coordinate axes and take into
account the discrete form of the Ramachandran plot.

The constructed model is able to describe the hydrogen bonds in the 
backbone of a peptide chain. In particular the lattice alpha-helices and 
beta-sheets have the correct picture of hydrogen bonds. The model also 
takes into account the interaction of side chains. 

In this model the local minima of the energy (11) take the form of
combinations of secondary structures, since this is the only way to saturate a
large number of hydrogen bonds in the backbone of the peptide chain and
satisfy the geometric restrictions on the form of the peptide chain (as for real
proteins).